Guaranteeing no interaction between functional dependencies and tree-like inclusion dependencies
Functional dependencies (FDs) and inclusion dependencies (INDs) are the most fundamental integrity constraints that arise in practice in relational databases. A given set of FDs does not interact with a given set of INDs if logical implication of any FD can be determined solely by the given set of FDs, and logical implication of any IND can be determined solely by the given set of INDs. The set of tree-like INDs constitutes a useful subclass of INDs whose implication problem is polynomial time decidable. We exhibit a necessary and sufficient condition for a set of FDs and tree-like INDs not to interact; this condition can be tested in polynomial time
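When the FDs and INDs do not interact, implication of an FD can be decided from the FDs alone. A minimal sketch of that FD-only check, using the standard textbook attribute-closure algorithm within a single relation schema, is shown below. It is not the non-interaction condition from the abstract, and the names `attribute_closure` and `implies` are illustrative assumptions.

```python
def attribute_closure(attrs, fds):
    """Return the closure of `attrs` under `fds`.

    `fds` is a list of (lhs, rhs) pairs, each side a set of attribute names.
    This is the classical closure algorithm, not the paper's condition.
    """
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire the FD if its left-hand side is already in the closure.
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


def implies(fds, lhs, rhs):
    """True if the FD lhs -> rhs follows from `fds` alone."""
    return set(rhs) <= attribute_closure(lhs, fds)


if __name__ == "__main__":
    fds = [({"A"}, {"B"}), ({"B"}, {"C"})]
    print(implies(fds, {"A"}, {"C"}))
    print(implies(fds, {"C"}, {"A"}))
```

The loop runs at most once per FD that fires, so the check is polynomial in the size of the FD set.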
Kemeny's constant and the random surfer
We revisit Kemeny's constant in the context of Web navigation, also known as "surfing." We generalize the constant, derive upper and lower bounds on it, and give it a novel interpretation in terms of the number of links a random surfer will follow to reach his final destination
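For orientation, the classical (ungeneralized) Kemeny constant of an irreducible Markov chain can be computed as the stationary-weighted sum of mean first passage times from any start state. The sketch below assumes the convention that the passage time from a state to itself is zero (other conventions shift the value by one), and all function names are illustrative assumptions rather than anything from the paper.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def stationary(p):
    """Stationary distribution pi with pi P = pi and sum(pi) = 1."""
    n = len(p)
    a = [[p[j][i] - (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    a[-1] = [1.0] * n  # replace one balance equation by normalization
    return solve(a, [0.0] * (n - 1) + [1.0])


def mean_first_passage(p, target):
    """Expected number of steps to reach `target` from each other state."""
    others = [i for i in range(len(p)) if i != target]
    a = [[(1.0 if i == k else 0.0) - p[i][k] for k in others]
         for i in others]
    return dict(zip(others, solve(a, [1.0] * len(others))))


def kemeny_constant(p, start=0):
    """Sum over j != start of pi_j * m(start, j)."""
    pi = stationary(p)
    return sum(pi[j] * mean_first_passage(p, j)[start]
               for j in range(len(p)) if j != start)


if __name__ == "__main__":
    p = [[0.5, 0.5], [0.25, 0.75]]  # hypothetical two-page "web"
    print(kemeny_constant(p, 0), kemeny_constant(p, 1))
```

The two printed values should agree, reflecting the defining property that the constant does not depend on the start state.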
Why is the snowflake schema a good data warehouse design?
Database design for data warehouses is based on the notion of the snowflake schema and its important special case, the star schema. The snowflake schema represents a dimensional model which is composed of a central fact table and a set of constituent dimension tables which can be further broken up into subdimension tables. We formalise the concept of a snowflake schema in terms of an acyclic database schema whose join tree satisfies certain structural properties. We then define a normal form for snowflake schemas which captures its intuitive meaning with respect to a set of functional and inclusion dependencies. We show that snowflake schemas in this normal form are independent as well as separable when the relation schemas are pairwise incomparable. This implies that relations in the data warehouse can be updated independently of each other as long as referential integrity is maintained. In addition, we show that a data warehouse in snowflake normal form can be queried by joining the relation over the fact table with the relations over its dimension and subdimension tables. We also examine an information-theoretic interpretation of the snowflake schema and show that the redundancy of the primary key of the fact table is zero
A Stochastic Evolutionary Growth Model for Social Networks
We present a stochastic model for a social network, where new actors may join
the network, existing actors may become inactive and, at a later stage,
reactivate themselves. Our model captures the evolution of the network,
assuming that actors attain new relations or become active according to the
preferential attachment rule. We derive the mean-field equations for this
stochastic model and show that, asymptotically, the distribution of actors
obeys a power-law distribution. In particular, the model applies to social
networks such as wireless local area networks, where users connect to
access-points, and peer-to-peer networks where users connect to each other. As
a proof of concept, we demonstrate the validity of our model empirically by
analysing a public log containing traces from a wireless network at Dartmouth
College over a period of three years. Analysing the data processed according to
our model, we demonstrate that the distribution of user accesses is
asymptotically a power-law distribution. Comment: 15 pages, 1 figure
Zipf's Law for web surfers
One of the main activities of Web users, known as 'surfing', is to follow links. Lengthy navigation often leads to disorientation when users lose track of the context in which they are navigating and are unsure how to proceed in terms of the goal of their original query. Studying navigation patterns of Web users is thus important, since it can lead us to a better understanding of the problems users face when they are surfing. We derive Zipf's rank frequency law (i.e., an inverse power law) from an absorbing Markov chain model of surfers' behavior assuming that less probable navigation trails are, on average, longer than more probable ones. In our model the probability of a trail is interpreted as the relevance (or 'value') of the trail. We apply our model to two scenarios: in the first the probability of a user terminating the navigation session is independent of the number of links he has followed so far, and in the second the probability of a user terminating the navigation session increases by a constant each time the user follows a link. We analyze these scenarios using two experimental data sets, showing that, although the first scenario is only a rough approximation of surfers' behavior, the data is consistent with the second scenario and can thus provide an explanation of surfers' behavior
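The two termination scenarios can be contrasted with a small sketch of the resulting session-length distribution. This is only an illustration under stated assumptions: the function name, the parameters `q0` and `step`, and the capping of the termination probability at 1 are hypothetical choices, and the sketch does not reproduce the paper's derivation of Zipf's law.

```python
def session_length_distribution(q0, step, max_links):
    """Probability that a session ends after exactly n links, n = 0..max_links.

    q0   : assumed termination probability before any link is followed
    step : assumed constant added to it after each followed link
           (step = 0 gives the first scenario, step > 0 the second)
    """
    probs = []
    survive = 1.0  # probability the session is still going
    for n in range(max_links + 1):
        q_n = min(1.0, q0 + step * n)  # cap so it stays a probability
        probs.append(survive * q_n)
        survive *= 1.0 - q_n
    return probs


if __name__ == "__main__":
    print(session_length_distribution(0.5, 0.0, 4))   # scenario 1: geometric
    print(session_length_distribution(0.5, 0.25, 4))  # scenario 2: faster decay
```

With `step = 0` the lengths are geometric; with a positive `step` long sessions become rarer more quickly, which is the qualitative difference between the two scenarios.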
A Discrete Evolutionary Model for Chess Players' Ratings
The Elo system for rating chess players, also used in other games and sports,
was adopted by the World Chess Federation over four decades ago. Although not
without controversy, it is accepted as generally reliable and provides a method
for assessing players' strengths and ranking them in official tournaments.
It is generally accepted that the distribution of players' rating data is
approximately normal but, to date, no stochastic model of how the distribution
might have arisen has been proposed. We propose such an evolutionary stochastic
model, which models the arrival of players into the rating pool, the games they
play against each other, and how the results of these games affect their
ratings. Using a continuous approximation to the discrete model, we derive the
distribution for players' ratings at time t as a normal distribution, where
the variance increases in time as a logarithmic function of t. We validate
the model using published rating data from 2007 to 2010, showing that the
parameters obtained from the data can be recovered through simulations of the
stochastic model.
The distribution of players' ratings is only approximately normal and has
been shown to have a small negative skew. We show how to modify our
evolutionary stochastic model to take this skewness into account, and we
validate the modified model using the published official rating data. Comment: 17 pages, 4 figures
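As background for how game results affect ratings, the standard Elo update can be sketched as follows. This is the commonly published Elo formula, not the evolutionary model proposed in the abstract; the K-factor of 20 and the function names are assumptions chosen for illustration.

```python
def expected_score(r_a, r_b):
    """Expected score of player A against player B under the Elo formula."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def update(r_a, r_b, score_a, k=20.0):
    """Return the new ratings of A and B after one game.

    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    k is the K-factor; 20 is an assumed value for this sketch.
    """
    delta = k * (score_a - expected_score(r_a, r_b))
    # The update is zero-sum: what A gains, B loses.
    return r_a + delta, r_b - delta


if __name__ == "__main__":
    print(update(1500.0, 1500.0, 1.0))
```

Because each game moves rating points between two players without creating any, the shape of the rating distribution over time depends on how players enter the pool and whom they play, which is what a stochastic model of the kind described above has to capture.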
An Approach to Enable Cloud-Computing by the Abstraction of Event-Processing Classes
Following our introduction of the concept of Abstraction Classes, we present herein their realisation within a cloud environment. This is achieved using a combination of integrated service-location models, including Knowledge-Based Systems, and distributed metadata using XML. This is complemented by service control software invoked at the level of Abstraction Classes
A stochastic evolutionary model for survival dynamics
The recent interest in human dynamics has led researchers to investigate the stochastic
processes that explain human behaviour in different contexts. Here we propose a generative
model to capture the essential dynamics of survival analysis, traditionally employed in
clinical trials and reliability analysis in engineering. In our model, the only implicit assumption
made is that the longer an actor has been in the system, the more likely it is to have
failed. We derive a power-law distribution for the process and provide preliminary empirical
evidence for the validity of the model from two well-known survival analysis data sets
A bi-logistic growth model for conference registration with an early bird deadline
The recent interest in human dynamics has led researchers to investigate the processes that explain human behaviour within different contexts. Here we are concerned
with modelling the human response to a deadline, and in particular we look at the process of conference registration with an early bird deadline. We provide empirical evidence
from a six-year conference registration data set that the bi-logistic growth function, with the interpretation as registration with an early bird deadline, can be viewed as a social mechanism
- …